-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Conversation
Request for data collection review formAll questions are mandatory. You must receive review from a data steward peer on your responses to these questions before shipping new data collection.
|
Data review for @liuche |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a glean
perspective, this looks good. I left what I hope is a helpful comment on the GleanMetricsService.kt
file.
|
||
override fun shouldTrack(event: Event): Boolean = Settings.getInstance(context).isTelemetryEnabled | ||
override fun shouldTrack(event: Event): Boolean { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Glean does this internally based on a persistent internal flag that can be toggled using the Glean.setUploadEnabled()
function. The idea was to toggle the flag when the user opted out of telemetry and let the glean library handle discarding recorded data and preventing upload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@travis79 Good to know, the secondary case for this method is we need a way to allow/block some metrics from being sent to some providers to support Leanplum
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then that's a good case for doing this in the client app. Glean currently doesn't have a way to disable a single specific metric at run-time, only through the metrics.yaml file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It probably wouldn't be too hard to make that work -- but maybe it doesn't buy much...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@travis79 @mdboom If you look at the code in metrics.kt
you can see how I have it setup. Right now I have a generic Event
type for things that I want to track inside Fenix. When we track an event it dispatches each event to a MetricsService
which decides whether or not it wants to track that event and then transforms is appropriately into a format that service expects.
// Interaction Events | ||
data class SearchBarTapped(val source: Source) : Event() { | ||
enum class Source { HOME, BROWSER } | ||
override val extras: Map<String, String>? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a heads up -- the extras
map will become Map<Enum, String>
in the next android-components release. This lets us check that the keys are valid at compile time rather than run time. Unfortunately, it's a breaking change to the API -- and you're our first external user (lucky you!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually really like this change 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need some changes, mostly de-duplicating documentation (which is good! writing docs once is better)
Thanks for being the guinea pig for Glean SDK usage and documentation. I think parts are working, but would be happy to see some clearer documentation on the Glean SDK side so future consumers of the library have a clearer understanding. @mdboom is that something we could work on for future consumers of Glean? I think this is a valuable first pass!
data_reviews: | ||
- N/A | ||
- https://github.com/mozilla-mobile/fenix/pull/1067#issuecomment-474598673 | ||
notification_emails: | ||
- [email protected] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will it be useful to have a Product/Team email here as well, to be aware of expiry?
app/metrics.yaml
Outdated
data_reviews: | ||
- N/A | ||
- https://github.com/mozilla-mobile/fenix/pull/1067#issuecomment-474598673 | ||
notification_emails: | ||
- [email protected] | ||
expires: never |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In terms of expiry, I'm not a fan of "never" expiry - is setting this to 6mo-1yr acceptable? It's also a regular check for whoever the product manager is to be aware of what data we're collecting.
cc @mdboom on seeing this "never" pop up from the sample code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I'm not even sure what expiry would mean here, @mdboom is this a date field? Would be really helpful to have this in the sample yaml file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liuche https://mozilla.github.io/glean_parser/metrics-yaml.html#expires It looks like this field isn't for when the data expires, but when we should stop collecting this data.
Definitely an interesting idea! But I don't think we'll want this to expire?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, after re-reading I understand now! I will set this to a year. I'm not sure if there is a good mailing list to alert yet though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's the correct documentation for that field. And as you point out, data retention is a separate issue (which we hope to address at some point...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mdboom yeah, my point here was more to say, can you set the sample yaml file to actually use an expiration date, rather than never
? I think that's a huge problem when the "default" in the sample is "never expire this" 😓
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liuche I changed it to a date, specifically 1 year from now. I think we can talk with Product about this more later as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liuche: I'm not sure how to do that without breaking the sample, though. :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked to @mdboom and he said he'll file a bug next week to set a date for the sample, and add a comment for any consumers of it to update the expiry date.
app/metrics.yaml
Outdated
@@ -0,0 +1,20 @@ | |||
# This file defines the metrics that are recorded by glean telemetry. They are |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit do you need a license on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I assume this is just copied from the Glean sample file? I wonder if we should include the license there too @mdboom
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
app/metrics.yaml
Outdated
description: > | ||
A User opened the app | ||
extra_keys: | ||
source: "The source from which the app was opened" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you be more specific about this, including all possible values this can be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see the higher detail later in the docs! I think the goal of Glean is to avoid having to write those docs anymore, so put all that detail and values and explanation right here in the yaml file itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 whenever possible
@@ -52,6 +52,16 @@ sealed class Event { | |||
get() = mapOf("source" to source.name) | |||
} | |||
|
|||
data class EnteredUrl(val autoCompleted: Boolean) : Event() { | |||
override val extras: Map<String, String>? | |||
get() = mapOf("autocomplete" to autoCompleted.toString()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly I don't love all these booleans-to-strings, because it's less clear what is happening. @mdboom can Glean support non-string, string maps? I thought maps could be printed out as strings without every item being a string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5b34874
to
1feea32
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
data-review+
One tiny nit about LP + typo, and @boek please remember to add a Product email to the expiry notifications (or file a bug if necessary to track).
Also @mdboom to add a bug to change expiry: never
to an actual date + comment in the sample.yaml code (please link here).
Data Review Form
-
Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?
Yes, documented by Glean SDK in metrics.yaml, and can be scraped into docs page in Automate Fenix telemetry documentation to probe-scraper #1156 -
Is there a control mechanism that allows the user to turn the data collection on and off?
Yes, Fenix has data collection toggle in-app -
If the request is for permanent data collection, is there someone who will monitor the data over time?**
Expires in 1 year, open for renewal -
Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under? **
Type 2, interaction: app opened, search bar tapped, entered url, performed search, default browser, with either booleans or a set of pre-defined event strings. -
Is the data collection request for default-on or default-off?
Default on -
Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?
No -
Is the data collection covered by the existing Firefox privacy notice?
Yes -
Does there need to be a check-in in the future to determine whether to renew the data? (Yes/No) (If yes, set a todo reminder or file a bug if appropriate)**
No, will expire - but @boek needs to add another Product email to these probes so they get notified when the probe collection is about to expire. -
Does the data collection use a third-party collection tool?
No, uses Glean
docs/telemetry.md
Outdated
Fenix sends event pings that allows us to measure feature performance. These are defined inside the It is defined inside the [`metrics.yaml`](https://github.com/mozilla-mobile/fenix/blob/master/app/metrics.yaml) file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: typo in this sentence
Also, you should keep the app_opened ping because the Leanplum key is not documented in Glean I assume, and #981 is still open.
Tracking notification reminder here: #1157 |
No description provided.